Usefulness, localizability, humanness, and language-benefit: additional evaluation criteria for natural language dialogue systems

Authors

  • Bayan Abu Shawar
  • Eric Atwell
Abstract

Human-computer dialogue systems interact with human users using natural language. We used the ALICE/AIML chatbot architecture as a platform to develop a range of chatbots covering different languages, genres, text-types, and user-groups, to illustrate qualitative aspects of natural language dialogue system evaluation. We present some of the different evaluation techniques used for natural language dialogue systems, including black box and glass box, comparative, quantitative, and qualitative evaluation. Four aspects of NLP dialogue system evaluation are often overlooked: "usefulness" in terms of a user's qualitative needs, "localizability" to new genres and languages, "humanness" compared to human-human dialogues, and "language benefit" compared to alternative interfaces. We illustrate these aspects with respect to our work on machine-learnt chatbot dialogue systems; we believe attention to these aspects is worthwhile for winning over potential new users and customers.
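For context, the ALICE/AIML architecture mentioned in the abstract encodes stimulus-response rules as XML "categories", which is what makes it straightforward to retrain for new languages and genres. A minimal illustrative category might look like the following (the pattern and reply text are invented for illustration, not taken from the paper):

```xml
<!-- An AIML category: <pattern> matches the user's input,
     <template> is the bot's reply; * is a wildcard. -->
<category>
  <pattern>HELLO *</pattern>
  <template>Hi there! How can I help you today?</template>
</category>
```

Because a chatbot's knowledge is just a collection of such categories, a new corpus of human-human dialogue in another language can, in principle, be machine-learnt into patterns and templates without changing the engine itself.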


Related articles

Dialogue Annotation for Language Systems Evaluation

The evaluation of Natural Language Processing (NLP) systems is still an open problem, demanding further progress from the research community to establish general evaluation frameworks. In this paper we present an experimental multilevel annotation process to be followed during the testing phase of Spoken Language Dialogue Systems (SLDSs). Based on this process we address some issues rel...


Usability Evaluation In Spoken Language Dialogue Systems

The paper first addresses a series of issues basic to evaluating the usability of spoken language dialogue systems, including types and purpose of evaluation, when to evaluate and which methods to use, user involvement, how to evaluate and what to evaluate. We then go on to present and discuss a comprehensive set of usability evaluation criteria for spoken language dialogue systems.


Designing and evaluating a wizarded uncertainty-adaptive spoken dialogue tutoring system

We describe the design and evaluation of two different dynamic student uncertainty adaptations in wizarded versions of a spoken dialogue tutoring system. The two adaptive systems adapt to each student turn based on its uncertainty, after an unseen human “wizard” performs speech recognition and natural language understanding and annotates the turn for uncertainty. The design of our two uncertain...


Description Logics for Natural Language Processing

This paper briefly surveys the activity of the Knowledge Representation and Reasoning group at IRST for Natural Language Processing. We have developed two Description Logic based systems to be used in large Natural Language dialogue architectures. The functional interaction of such KR systems with the other modules is briefly described. Then, several qualifying extensions of the basic systems ...


Natural language generation for spoken dialogue

A natural language generation module for spoken dialogue systems has been developed that performs three steps: generating multiple versions of an utterance, choosing the best version by a set of criteria, and annotating the text using structural information accumulated during the generation process. The requirements of spoken output are catered for by several design decisions.



Journal:
  • I. J. Speech Technology

Volume 19, Issue -

Pages -

Publication date: 2016